1.
IEEE Trans Cybern ; PP, 2024 Apr 05.
Article in English | MEDLINE | ID: mdl-38578861

ABSTRACT

The utilization of robots in computer, communication, and consumer electronics (3C) assembly has the potential to significantly reduce labor costs and enhance assembly efficiency. However, many typical scenarios in 3C assembly, such as the assembly of flexible printed circuits (FPCs), involve complex manipulations with long-horizon steps and high-precision requirements that cannot be effectively accomplished through manual programming or conventional skill-learning methods. To address this challenge, this article proposes a learning-based framework for the acquisition of complex 3C assembly skills assisted by a multimodal digital-twin environment. First, we construct a fully equivalent digital-twin environment based on the real-world counterpart, equipped with visual, tactile force, and proprioception information, and then collect multimodal demonstration data using virtual reality (VR) devices. Next, we construct a skill knowledge base through multimodal skill parsing of the demonstration data, resulting in primitive policy sequences for achieving 3C assembly tasks. Finally, we train primitive policies via a combination of curriculum learning, residual reinforcement learning, and domain randomization methods and transfer the learned skills from the digital-twin environment to the real-world environment. Experiments are conducted to verify the effectiveness of the proposed method.

2.
Article in English | MEDLINE | ID: mdl-38300770

ABSTRACT

Hierarchical reinforcement learning (HRL) exhibits remarkable potential in addressing large-scale and long-horizon complex tasks. However, a fundamental challenge, which arises from the inherently entangled nature of hierarchical policies, has not been understood well, consequently compromising the training stability and exploration efficiency of HRL. In this article, we propose a novel HRL algorithm, high-level model approximation (HLMA), presenting both theoretical foundations and practical implementations. In HLMA, a Planner constructs an innovative high-level dynamic model to predict the k -step transition of the Controller in a subtask. This allows for the estimation of the evolving performance of the Controller. At low level, we leverage the initial state of each subtask, transforming absolute states into relative deviations by a designed operator as Controller input. This approach facilitates the reuse of subtask domain knowledge, enhancing data efficiency. With this designed structure, we establish the local convergence of each component within HLMA and subsequently derive regret bounds to ensure global convergence. Abundant experiments conducted on complex locomotion and navigation tasks demonstrate that HLMA surpasses other state-of-the-art single-level RL and HRL algorithms in terms of sample efficiency and asymptotic performance. In addition, thorough ablation studies validate the effectiveness of each component of HLMA.
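The low-level idea in this abstract, feeding the Controller deviations from the subtask's initial state instead of absolute states, can be illustrated with a minimal sketch. The abstract does not specify the "designed operator", so plain subtraction is assumed here, and the names `to_relative`, `traj_a`, and `traj_b` are hypothetical.

```python
from typing import List, Sequence


def to_relative(state: Sequence[float], subtask_start: Sequence[float]) -> List[float]:
    """Express an absolute state as a deviation from the subtask's initial state.

    Assumption: the operator is element-wise subtraction; the paper's actual
    operator may differ.
    """
    if len(state) != len(subtask_start):
        raise ValueError("state and subtask_start must have the same length")
    return [s - s0 for s, s0 in zip(state, subtask_start)]


# Two subtasks that start at different absolute positions but follow the
# same local motion (illustrative data, not from the paper).
traj_a = [[0.0, 0.0], [1.0, 0.5], [2.0, 1.0]]
traj_b = [[10.0, -3.0], [11.0, -2.5], [12.0, -2.0]]

rel_a = [to_relative(s, traj_a[0]) for s in traj_a]
rel_b = [to_relative(s, traj_b[0]) for s in traj_b]
```

Under this assumption both trajectories map to the same Controller inputs, which is one way to read the claim that relative deviations let subtask knowledge be reused.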

3.
Soft Robot ; 2024 Feb 21.
Article in English | MEDLINE | ID: mdl-38386776

ABSTRACT

Teleoperation in soft robotics can endow soft robots with the ability to perform complex tasks through human-robot interaction. In this study, we propose a teleoperated anthropomorphic soft robot hand with variable degrees of freedom (DOFs) and a metamorphic palm. The soft robot hand consists of four pneumatically actuated fingers, which can be heated to tune their stiffness. A metamorphic mechanism actuated by servo motors morphs the palm. The human fingers' DOFs, gestures, and muscle stiffness were collected and mapped to the soft robotic hand through the sensory feedback from surface electromyography devices on the jib. The results show that the proposed soft robot hand can generate a variety of anthropomorphic configurations and can be remotely controlled to perform complex tasks such as primitively operating a cell phone and placing building blocks. We also show that the soft hand can grasp a target through a slit by varying the DOFs and stiffness in a trial.

4.
Neural Netw ; 172: 106075, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38278092

ABSTRACT

The SSVEP-based paradigm is a prevalent approach in brain-computer interfaces (BCIs). However, the processing of multi-channel electroencephalogram (EEG) data introduces challenges due to its non-Euclidean characteristics, necessitating methodologies that account for inter-channel topological relations. In this paper, we introduce the Dynamic Decomposition Graph Convolutional Neural Network (DDGCNN), designed for the classification of SSVEP EEG signals. Our approach incorporates layerwise dynamic graphs to address the oversmoothing issue in Graph Convolutional Networks (GCNs), employing a dense connection mechanism to mitigate the gradient vanishing problem. Furthermore, we enhance the traditional linear transformation inherent in GCNs with graph dynamic fusion, thereby improving feature extraction and adaptive aggregation capabilities. Our experimental results demonstrate the effectiveness of the proposed approach in learning and extracting features from the EEG topological structure. The results show that DDGCNN outperforms other state-of-the-art (SOTA) algorithms reported on two datasets (Dataset 1: 54 subjects, 4 targets, 2 sessions; Dataset 2: 35 subjects, 40 targets). Additionally, we showcase the implementation of DDGCNN in the context of synchronized BCI robotic fish control. This work represents a significant advancement in the field of EEG signal processing for SSVEP-based BCIs. Our proposed method processes SSVEP time-domain signals directly as an end-to-end system, making it easy to deploy. The code is available at https://github.com/zshubin/DDGCNN.


Subjects
Brain-Computer Interfaces, Humans, Visual Evoked Potentials, Neural Networks (Computer), Algorithms, Electroencephalography/methods, Photic Stimulation
5.
Article in English | MEDLINE | ID: mdl-38190682

ABSTRACT

The label transition matrix has emerged as a widely accepted method for mitigating label noise in machine learning. In recent years, numerous studies have centered on leveraging deep neural networks to estimate the label transition matrix for individual instances within the context of instance-dependent noise. However, these methods suffer from low search efficiency due to the large space of feasible solutions. We find that the real culprit behind this drawback lies in invalid class transitions, that is, the actual transition probability between certain classes is zero but is estimated to have a nonzero value. To mask the invalid class transitions, we introduce a human-cognition-assisted method that uses structural information from human cognition. Specifically, we introduce a structured transition matrix network (STMN) designed with an adversarial learning process to balance instance features and prior information from human cognition. The proposed method offers two advantages: 1) better estimation effectiveness is obtained by sparsifying the transition matrix and 2) better estimation accuracy is obtained with the assistance of human cognition. By exploiting these two advantages, our method parametrically estimates a sparse label transition matrix, effectively converting noisy labels into true labels. The efficiency and superiority of our proposed method are substantiated through comprehensive comparisons with state-of-the-art methods on three synthetic datasets and a real-world dataset. Our code will be available at https://github.com/WheatCao/STMN-Pytorch.
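To make the notion of masking invalid class transitions concrete, here is a small sketch under stated assumptions. It is not STMN: the mask is hand-written rather than learned adversarially, the mask-then-renormalize step is an illustrative choice, and the three classes and all numbers are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def mask_and_normalize(transition: Matrix, valid: List[List[bool]]) -> Matrix:
    """Zero out invalid class transitions, then renormalize each row to sum to 1."""
    masked = []
    for row, row_valid in zip(transition, valid):
        kept = [p if ok else 0.0 for p, ok in zip(row, row_valid)]
        total = sum(kept)
        if total == 0.0:
            raise ValueError("every row needs at least one valid transition")
        masked.append([p / total for p in kept])
    return masked


def noisy_posterior(clean_posterior: List[float], transition: Matrix) -> List[float]:
    """P(noisy = j) = sum_i P(clean = i) * T[i][j]."""
    n_classes = len(transition[0])
    return [
        sum(clean_posterior[i] * transition[i][j] for i in range(len(transition)))
        for j in range(n_classes)
    ]


# Hypothetical 3-class problem (cat, dog, truck): animals are never mislabeled
# as trucks and vice versa, so those entries are masked out.
estimated = [
    [0.80, 0.15, 0.05],
    [0.20, 0.75, 0.05],
    [0.05, 0.05, 0.90],
]
valid = [
    [True, True, False],
    [True, True, False],
    [False, False, True],
]
sparse_T = mask_and_normalize(estimated, valid)
noisy = noisy_posterior([0.7, 0.2, 0.1], sparse_T)
```

The masked matrix has fewer free entries to estimate, which is the intuition behind the smaller search space described in the abstract.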

6.
Article in English | MEDLINE | ID: mdl-37756172

ABSTRACT

The classification problem for short time-window steady-state visual evoked potentials (SSVEPs) is important in practical applications because a shorter time window often means a faster response. By combining the local feature learning ability of the convolutional neural network (CNN) with the ability of the attention mechanism to distinguish feature importance, a novel network called AttentCNN is proposed to further improve the classification performance for short time-window SSVEPs. Because the frequency-domain features extracted from short time-window signals are not distinct, this network starts with a time-domain feature extraction module based on a filter bank (FB). The FB consists of four sixth-order Butterworth filters with different bandpass ranges. The extracted multimodal features are then aggregated. The second major module is a set of residual squeeze and excitation blocks (RSEs) that improve the quality of the extracted features by learning the interdependence between features. The final major module is a time-domain CNN (tCNN) that consists of four CNNs for further feature extraction, followed by a fully connected (FC) layer for output. Our designed networks are validated on two large public datasets, and the necessary comparisons are given to verify the effectiveness and superiority of the proposed network. Finally, to demonstrate the application potential of the proposed strategy in the medical rehabilitation field, we design a novel five-finger bionic hand and connect it to our trained network to control the bionic hand directly with human brain signals. Our source code is available on GitHub: https://github.com/JiannanChen/AggtCNN.git.

7.
IEEE Trans Cybern ; PP, 2023 Sep 15.
Article in English | MEDLINE | ID: mdl-37713227

ABSTRACT

Smart manufacturing requires an effective solution for rigid, contact-rich robotic manipulation in unstructured dynamic environments. Single peg-in-hole assembly is the most common use case in intelligent industry, and many studies based on reinforcement learning (RL) algorithms have been conducted to improve its performance. However, existing RL methods are difficult to apply to multiple peg-in-hole problems because of their more complicated geometric and physical constraints. In addition, the few existing solutions for multiple peg-in-hole assembly are hard to transfer flexibly to real industrial scenarios. To address these issues, this work designs a novel and more challenging multiple peg-in-hole assembly setup by taking advantage of the Industrial Metaverse. We propose a detailed solution scheme to solve this task. Specifically, multiple modalities, including vision, proprioception, and force/torque, are learned as compact representations to account for the complexity and uncertainties and to improve sample efficiency. Furthermore, RL is used in simulation to train the policy, and the learned policy is transferred to the real world without extra exploration. Domain randomization and impedance control are embedded into the policy to narrow the gap between simulation and reality. Evaluation results demonstrate the effectiveness of the proposed solution, showcasing successful multiple peg-in-hole assembly and generalization across different object shapes in real-world scenarios.
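Domain randomization, as mentioned in this abstract, usually means drawing fresh simulator parameters for every training episode so that the policy does not overfit to a single simulated world. The sketch below shows only that sampling step. The abstract does not say which quantities are randomized, so every parameter name and range here is a hypothetical placeholder.

```python
import random
from typing import Dict, Tuple

# Hypothetical parameter ranges (assumptions for illustration only).
PARAM_RANGES: Dict[str, Tuple[float, float]] = {
    "friction": (0.4, 1.2),
    "peg_mass_kg": (0.05, 0.20),
    "hole_clearance_mm": (0.1, 0.5),
    "camera_noise_std": (0.0, 0.02),
}


def sample_sim_params(rng: random.Random) -> Dict[str, float]:
    """Draw one set of simulator parameters uniformly from the configured ranges."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in PARAM_RANGES.items()}


# One parameter set per training episode; a seeded generator keeps runs repeatable.
rng = random.Random(42)
episodes = [sample_sim_params(rng) for _ in range(100)]
```

In a full pipeline each sampled dictionary would configure the simulator before an episode starts; that part is omitted because it depends on the simulator in use.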

8.
Article in English | MEDLINE | ID: mdl-37494169

ABSTRACT

It has been observed that graph convolutional networks (GCNs) suffer a marked drop in performance when multiple layers are stacked. The main reason deep GCNs fail is oversmoothing, which isolates the network output from the input as network depth increases, weakening expressivity and trainability. In this article, we start by investigating refined measures built upon DropEdge, an existing simple yet effective technique to relieve oversmoothing. We term our method DropEdge++ for its two structure-aware samplers, in contrast to DropEdge: a layer-dependent (LD) sampler and a feature-dependent (FD) sampler. Regarding the LD sampler, we find, interestingly, that increasingly sampling edges from the bottom layer yields better performance than both the decreasing counterpart and DropEdge. We theoretically explain this phenomenon with the mean-edge-number (MEN), a metric closely related to oversmoothing. For the FD sampler, we associate the edge sampling probability with the feature similarity of node pairs and prove that it further correlates the convergence subspace of the output layer with the input features. Extensive experiments on several node classification benchmarks, including both full- and semi-supervised tasks, illustrate the efficacy of DropEdge++ and its compatibility with a variety of backbones by achieving generally better performance than DropEdge and the no-drop version.
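The layer-dependent idea can be sketched as follows: each GCN layer sees a random subset of the edges, and the subset grows from the bottom layer upward. This is only an illustration. The nesting of subsets, the keep ratios, and the function name `sample_edges_per_layer` are assumptions, not details taken from the abstract.

```python
import random
from typing import List, Tuple

Edge = Tuple[int, int]


def sample_edges_per_layer(
    edges: List[Edge], keep_ratios: List[float], rng: random.Random
) -> List[List[Edge]]:
    """Return one edge subset per layer, taken as prefixes of a single shuffle.

    With increasing keep_ratios, the bottom layer gets the fewest edges and
    every later layer contains all edges of the layer below it (assumed nesting).
    """
    if any(not 0.0 < r <= 1.0 for r in keep_ratios):
        raise ValueError("each keep ratio must be in (0, 1]")
    order = list(edges)
    rng.shuffle(order)
    per_layer = []
    for ratio in keep_ratios:
        n_keep = max(1, round(ratio * len(order)))
        per_layer.append(order[:n_keep])
    return per_layer


# Complete graph on 6 nodes: 15 undirected edges.
edges = [(i, j) for i in range(6) for j in range(i + 1, 6)]
layers = sample_edges_per_layer(edges, [0.4, 0.6, 0.8], random.Random(0))
```

Passing the same ratio for every layer would apply one uniformly sampled edge subset to all layers, a simple uniform-dropping baseline.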

9.
Biomimetics (Basel) ; 8(3), 2023 Jul 24.
Article in English | MEDLINE | ID: mdl-37504216

ABSTRACT

Myoelectric control for prosthetic hands is an important topic in the field of rehabilitation. Intuitive and intelligent myoelectric control can help amputees to regain upper limb function. However, current research efforts are primarily focused on developing rich myoelectric classifiers and biomimetic control methods, limiting prosthetic hand manipulation to simple grasping and releasing tasks, while rarely exploring complex daily tasks. In this article, we conduct a systematic review of recent achievements in two areas, namely, intention recognition research and control strategy research. Specifically, we focus on advanced methods for motion intention types, discrete motion classification, continuous motion estimation, unidirectional control, feedback control, and shared control. In addition, based on the above review, we analyze the challenges and opportunities for research directions of functionality-augmented prosthetic hands and user burden reduction, which can help overcome the limitations of current myoelectric control research and provide development prospects for future research.

10.
IEEE Trans Pattern Anal Mach Intell ; 45(10): 11948-11960, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37195849

ABSTRACT

In this paper, we propose a novel Knowledge-based Embodied Question Answering (K-EQA) task, in which the agent intelligently explores the environment to answer various questions using external knowledge. Unlike existing EQA work, which explicitly specifies the target object in the question, the agent can resort to external knowledge to understand more complicated questions such as "Please tell me what are objects used to cut food in the room?", for which the agent must have knowledge such as "knife is used for cutting food". To address this K-EQA problem, a novel framework based on neural program synthesis reasoning is proposed, in which joint reasoning over the external knowledge and a 3D scene graph is performed to realize navigation and question answering. In particular, the 3D scene graph provides memory for storing the visual information of visited scenes, which significantly improves the efficiency of multi-turn question answering. Experimental results demonstrate that the proposed framework is capable of answering more complicated and realistic questions in the embodied environment. The proposed method is also applicable to multi-agent scenarios.

11.
Entropy (Basel) ; 25(4), 2023 Apr 14.
Article in English | MEDLINE | ID: mdl-37190445

ABSTRACT

Autonomous indoor service robots are affected by multiple factors when they are directly involved in manipulation tasks in daily life, such as scenes, objects, and actions. It is of self-evident importance to properly parse these factors and interpret intentions according to human cognition and semantics. In this study, the design of a semantic representation framework based on a knowledge graph is presented, including (1) a multi-layer knowledge-representation model, (2) a multi-module knowledge-representation system, and (3) a method to extract manipulation knowledge from multiple sources of information. Moreover, with the aim of generating semantic representations of entities and relations in the knowledge base, a knowledge-graph-embedding method based on graph convolutional neural networks is proposed in order to provide high-precision predictions of factors in manipulation tasks. Through the prediction of action sequences via this embedding method, robots in real-world environments can be effectively guided by the knowledge framework to complete task planning and object-oriented transfer.

12.
IEEE Trans Pattern Anal Mach Intell ; 45(7): 8861-8873, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37021866

ABSTRACT

Adversarial attacks can easily fool object recognition systems based on deep neural networks (DNNs). Although many defense methods have been proposed in recent years, most of them can still be adaptively evaded. One reason for the weak adversarial robustness may be that DNNs are supervised only by category labels and lack the part-based inductive bias of the human recognition process. Inspired by recognition-by-components, a well-known theory in cognitive psychology, we propose a novel object recognition model, ROCK (Recognizing Object by Components with human prior Knowledge). It first segments parts of objects from images, then scores the part segmentation results with predefined human prior knowledge, and finally outputs a prediction based on the scores. The first stage of ROCK corresponds to the process of decomposing objects into parts in human vision. The second stage corresponds to the decision process of the human brain. ROCK shows better robustness than classical recognition models across various attack settings. These results encourage researchers to rethink the rationality of currently widely used DNN-based object recognition models and to explore the potential of part-based models, once important but recently ignored, for improving robustness.


Subjects
Algorithms, Neural Networks (Computer), Humans, Brain, Visual Perception
13.
Neural Comput ; 35(5): 958-976, 2023 Apr 18.
Article in English | MEDLINE | ID: mdl-36944244

ABSTRACT

Visual navigation involves a movable robotic agent striving to reach a point goal (target location) using vision sensory input. While navigation with ideal visibility has seen plenty of success, it becomes challenging in suboptimal visual conditions like poor illumination, where traditional approaches suffer from severe performance degradation. We propose E3VN (echo-enhanced embodied visual navigation) to effectively perceive the surroundings even under poor visibility to mitigate this problem. This is made possible by adopting an echoer that actively perceives the environment via auditory signals. E3VN models the robot agent as playing a cooperative Markov game with that echoer. The action policies of robot and echoer are jointly optimized to maximize the reward in a two-stream actor-critic architecture. During optimization, the reward is also adaptively decomposed into the robot and echoer parts. Our experiments and ablation studies show that E3VN is consistently effective and robust in point goal navigation tasks, especially under nonideal visibility.

14.
IEEE Trans Neural Netw Learn Syst ; 34(10): 7567-7577, 2023 Oct.
Article in English | MEDLINE | ID: mdl-35157591

ABSTRACT

This article investigates the robust adaptive learning control for space robots with target capturing. Based on the momentum conservation theory, the impact dynamics is constructed to derive the relationship of generalized velocity in the pre-impact and post-impact phase. Considering the nonlinear dynamics with contact impact, the robust control using nonsingular terminal sliding mode (NTSM) and fast NTSM is designed to achieve the fast realization of the desired states. Furthermore, for the unknown dynamics of the combination system after capturing a target, the adaptive learning control is developed based on neural network and disturbance observer. Through the serial-parallel estimation model, the prediction error is constructed for the update of adaptive law. The system signals involved in the Lyapunov function are proved to be bounded and the sliding mode surface converges in finite time. Simulation studies present the desired tracking and learning performance.

15.
IEEE Trans Neural Netw Learn Syst ; 34(12): 9981-9991, 2023 Dec.
Article in English | MEDLINE | ID: mdl-35412991

ABSTRACT

This article studies how to solve the dynamic Sylvester quaternion matrix equation (DSQME) using the neural dynamic method. To solve the DSQME, the complex representation method is first adopted to derive the equivalent dynamic Sylvester complex matrix equation (DSCME) from the DSQME. It is proven that the solution to the DSCME is in essence the same as that of the DSQME. Then, a state-of-the-art neural dynamic method is presented to generate a general dynamic-varying parameter zeroing neural network (DVPZNN) model, with its global stability guaranteed by Lyapunov theory. Specifically, when the linear activation function is utilized in the DVPZNN model, the corresponding model [termed linear DVPZNN (LDVPZNN)] achieves finite-time convergence, and a time range is theoretically calculated. When the nonlinear power-sigmoid activation function is utilized in the DVPZNN model, the corresponding model [termed power-sigmoid DVPZNN (PSDVPZNN)] achieves better convergence than the LDVPZNN model, which is proven in detail. Finally, three examples are presented to compare the solution performance of different neural models for the DSQME and the equivalent DSCME, and the results verify the correctness of the theories and the superiority of the two proposed DVPZNN models.
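The zeroing-dynamics principle behind this family of models can be shown on a much simpler problem. The sketch below tracks the solution of a scalar time-varying equation a(t)·x(t) = b(t) by forcing the error e = a·x − b to obey de/dt = −γ·e. It is not the DVPZNN: the gain γ is constant rather than dynamic-varying, the activation is linear, the problem is scalar rather than a quaternion Sylvester equation, and the functions a(t) and b(t) are arbitrary choices for illustration.

```python
import math
from typing import Tuple


def a(t: float) -> float:
    return 2.0 + math.sin(t)


def b(t: float) -> float:
    return math.cos(t)


def da(t: float) -> float:
    return math.cos(t)  # time derivative of a(t)


def db(t: float) -> float:
    return -math.sin(t)  # time derivative of b(t)


def znn_track(gamma: float = 10.0, dt: float = 1e-3, t_end: float = 5.0) -> Tuple[float, float]:
    """Integrate the zeroing dynamics with forward Euler.

    Returns the tracked x at the final time and the exact solution b/a there.
    """
    x, t = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        e = a(t) * x - b(t)  # error to be driven to zero
        # de/dt = da*x + a*dx - db = -gamma*e, solved for dx/dt:
        dx = (db(t) - da(t) * x - gamma * e) / a(t)
        x += dt * dx
        t += dt
    return x, b(t) / a(t)


x_final, x_exact = znn_track()
```

With a constant gain and linear activation the continuous-time error decays exponentially at rate γ, so this sketch does not reproduce the convergence properties that the abstract proves for its own models.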

16.
IEEE Trans Neural Netw Learn Syst ; 34(9): 5926-5936, 2023 Sep.
Article in English | MEDLINE | ID: mdl-34932488

ABSTRACT

This article studies the robust intelligent control for the longitudinal dynamics of flexible hypersonic flight vehicle with input dead zone. Considering the different time-scale characteristics among the system states, the singular perturbation decomposition is employed to transform the rigid-elastic coupling model into the slow dynamics and the fast dynamics. For the slow dynamics with unknown system nonlinearities, the robust neural control is constructed using the switching mechanism to achieve the coordination between robust design and neural learning. For the time-varying control gain caused by unknown dead-zone input, the stable control is presented with an adaptive estimation design. For the fast dynamics, the sliding mode control is constructed to make the elastic modes stable and convergent. The elevator deflection is obtained by combining the two control signals. The stability of the dynamics is analyzed through the Lyapunov approach and the system tracking errors are bounded. The simulation is conducted to demonstrate the effectiveness of the proposed approach.

17.
IEEE Trans Pattern Anal Mach Intell ; 45(1): 722-737, 2023 Jan.
Article in English | MEDLINE | ID: mdl-35104214

ABSTRACT

The rich content in various real-world networks such as social networks, biological networks, and communication networks provides unprecedented opportunities for unsupervised machine learning on graphs. This paper investigates the fundamental problem of preserving and extracting abundant information from graph-structured data into an embedding space without external supervision. To this end, we generalize conventional mutual information computation from vector space to the graph domain and present a novel concept, Graphical Mutual Information (GMI), to measure the correlation between the input graph and the hidden representation. Beyond standard GMI, which considers graph structures from a local perspective, our further proposed GMI++ additionally captures global topological properties by analyzing the co-occurrence relationship of nodes. GMI and its extension exhibit several benefits: first, they are invariant to the isomorphic transformation of input graphs, an inevitable constraint in many existing methods; second, they can be efficiently estimated and maximized by current mutual information estimation methods; lastly, our theoretical analysis confirms their correctness and rationality. With the aid of GMI, we develop an unsupervised embedding model and adapt it to the specific anomaly detection task. Extensive experiments indicate that our GMI methods achieve promising performance in various downstream tasks, such as node classification, link prediction, and anomaly detection.

18.
IEEE Trans Neural Netw Learn Syst ; 34(9): 5452-5463, 2023 Sep.
Article in English | MEDLINE | ID: mdl-35767493

ABSTRACT

Multifingered hand dexterous manipulation is quite challenging in the domain of robotics. One remaining issue is how to achieve compliant behaviors. In this work, we propose a human-in-the-loop learning-control approach for acquiring compliant grasping and manipulation skills of a multifinger robot hand. This approach takes the depth image of the human hand as input and generates the desired force commands for the robot. The markerless vision-based teleoperation system is used for the task demonstration, and an end-to-end neural network model (i.e., TeachNet) is trained to map the pose of the human hand to the joint angles of the robot hand in real-time. To endow the robot hand with compliant human-like behaviors, an adaptive force control strategy is designed to predict the desired force control commands based on the pose difference between the robot hand and the human hand during the demonstration. The force controller is derived from a computational model of the biomimetic control strategy in human motor learning, which allows adapting the control variables (impedance and feedforward force) online during the execution of the reference joint angles. The simultaneous adaptation of the impedance and feedforward profiles enables the robot to interact with the environment compliantly. Our approach has been verified in both simulation and real-world task scenarios based on a multifingered robot hand, that is, the Shadow Hand, and has shown more reliable performances than the current widely used position control mode for obtaining compliant grasping and manipulation behaviors.

19.
IEEE Trans Pattern Anal Mach Intell ; 45(5): 5481-5496, 2023 May.
Article in English | MEDLINE | ID: mdl-36178992

ABSTRACT

Multimodal fusion and multitask learning are two vital topics in machine learning. Despite fruitful progress, existing methods for both problems still struggle with the same challenge: it remains difficult to integrate the common information across modalities (resp. tasks) while preserving the specific patterns of each modality (resp. task). Moreover, although they are closely related, multimodal fusion and multitask learning have rarely been explored within the same methodological framework. In this paper, we propose the Channel-Exchanging-Network (CEN), which is self-adaptive, parameter-free, and, more importantly, applicable to multimodal and multitask dense image prediction. At its core, CEN adaptively exchanges channels between subnetworks of different modalities. Specifically, the channel exchanging process is self-guided by individual channel importance, which is measured by the magnitude of the Batch-Normalization (BN) scaling factor during training. For the application of dense image prediction, the validity of CEN is tested in four different scenarios: multimodal fusion, cycle multimodal fusion, multitask learning, and multimodal multitask learning. Extensive experiments on semantic segmentation via RGB-D data and image translation through multi-domain input verify the effectiveness of CEN compared to state-of-the-art methods. Detailed ablation studies have also been carried out, which demonstrate the advantage of each component we propose. Our code is available at https://github.com/yikaiw/CEN.
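The exchange rule described in this abstract can be sketched on plain lists: a channel whose BN scaling factor has a small magnitude is treated as unimportant and is replaced by the corresponding channel of the other modality. This is a simplified illustration, not the released CEN code. The threshold value, the flattened-list representation of feature maps, and the function name `exchange_channels` are assumptions.

```python
from typing import List, Tuple

Channel = List[float]  # one channel's feature map, flattened for simplicity


def exchange_channels(
    feat_a: List[Channel],
    feat_b: List[Channel],
    gamma_a: List[float],
    gamma_b: List[float],
    threshold: float = 0.02,  # hypothetical cutoff on |BN scaling factor|
) -> Tuple[List[Channel], List[Channel]]:
    """Swap in the other modality's channel wherever |gamma| falls below the threshold."""
    out_a, out_b = [], []
    for ch_a, ch_b, g_a, g_b in zip(feat_a, feat_b, gamma_a, gamma_b):
        out_a.append(ch_b if abs(g_a) < threshold else ch_a)
        out_b.append(ch_a if abs(g_b) < threshold else ch_b)
    return out_a, out_b


# Illustrative features for two modalities with three channels each.
feat_rgb = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
feat_depth = [[10.0, 10.0], [20.0, 20.0], [30.0, 30.0]]
gamma_rgb = [0.5, 0.001, 0.3]    # channel 1 is unimportant for RGB
gamma_depth = [0.0, 0.4, 0.6]    # channel 0 is unimportant for depth

new_rgb, new_depth = exchange_channels(feat_rgb, feat_depth, gamma_rgb, gamma_depth)
```

Because the rule reuses scaling factors that batch normalization already learns, it adds no parameters of its own, which matches the "parameter-free" description in the abstract.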

20.
Front Bioeng Biotechnol ; 10: 1016598, 2022.
Article in English | MEDLINE | ID: mdl-36246357

ABSTRACT

Although intelligent technologies have facilitated the development of precision orthopaedics, simple internal fixation, ligament reconstruction, or arthroplasty can relieve patients' pain only in the short term. To achieve the best recovery from musculoskeletal injuries, three bottlenecks must be overcome: scientific path planning, bioactive implants, and the building of personalized surgical channels. Because a scientific surgical path can be planned and built through AI technology, 4D printing technology allows more bioactive implants to be manufactured, and variable structures can establish personalized channels precisely, it is possible to achieve satisfactory and effective musculoskeletal injury recovery with the progress of multi-layer intelligent technologies (MLIT).
